Vol. 1, Issue 3, pp.1125-1133

### **FPGA Based Variable - point FFT Processor**

Padma priya dukkipati\*, K N H Srinivas\*\*

\*(M.Tech student, Department of ECE, Sri Vasavi Engineering College, Tadepalligudem) \*\* (Associate Professor, Department of ECE, Sri Vasavi Engineering College, Tadepalligudem)

### ABSTRACT-

An efficient variable length FFT processor is required for the downlink module of Orthogonal Frequency Division Multiple Access (OFDMA) system. This system is used in the mobility mode of the IEEE 802.16 Wireless MAN standard, commonly referred as WiMAX .In OFDMA system the carrier space is constant across different channel bandwidths.FFT is required to keep the carrier spacing constant across different channel bandwidths. In this paper the variable input FFT processor is designed to meet the requirements of OFDMA system. This design considers different FFT lengths. For this we use the 2D Fourier Transform algorithm, VHDL language to present the design in detail, ModelSim for the simulation and verify on the Spartan 3E FPGA.

### Keywords - FFT, FPGA, OFDMA, VHDL.

### I. INTRODUCTION

Modern Communications systems feature highly dynamic scalability, which often requires changing system parameters on- the-fly based on channel conditions and user quality of service (QOS) requirements. There has been a growing interest in wireless multiuser systems, such as WLAN and mobile communication systems. Orthogonal Frequency Division Modulation (OFDM) is considered a promising solution due to its combat multipath fading problems. For multiuser access, in Orthogonal Frequency Division Multiple Access (OFDMA) each user communicates with the base station over a set of dedicated sub channels. So OFDMA is also referred to as multiuser –OFDM [1]. In modern communications systems Orthogonal Frequency Division Multiple Access (OFDMA) plays a crucial role and it will be replaced by Orthogonal Frequency Division Multiple Access (OFDMA) in the next generation wireless communication system such as WiMAX and 3G-LTE standard [2],[3]. IEEE Std 802.16e-2005[4] is a wireless metropolitan area network standard and it improves upon IEEE 802.16-2004 by adding support for mobility.

FFT architectures can be categorized as two types, the pipelined architecture and the memory based architecture. Reference [5] presented a cascade structure of FFT processor which could implement input data length alterable based on radix-16 and radix-2/4/8 mix radix algorithm. "Ping-pong" memory architecture was adopted in the design, each level need a lot of memory.Becuase the architecture needed more storage resources, the design need improve. Reference [6] presented a scalable pipeline FFT which needs to improve the precision.Therefore, after analyzing the various FFT algorithms, the paper presents a variable point FFT processor of a pipeline structure chosen the decimation –in-time FFT and 2D FFT algorithms. The paper was organized as follows: Section -2 details the Fast Fourier Transform and explains the 2D algorithm, Section-3 gives the detailed description of the overall design. In Section-4, the implementation and simulation will be detailed and Section-5 gives main conclusions of this work.

### **II. FAST FOURIER TRANSFORM**

#### 2.1 Discrete Fourier Transform (DFT)

The Discrete Fourier Transform (DFT) has seen wide usage across a larger number of fields. It is widely used in data compression, communications, GPS navigation, radar, image processing, voice processing, and sonar systems. All applications of the DFT depend crucially on the availability of a fast algorithm to compute Discrete Fourier Transform and their inverses, a Fast Fourier Transform.

The Discrete Fourier Transform (DFT) is of the N- point input X (k) is defined as follows:

 $X (k) = \sum_{n=0}^{N-1} x(n) W_N^{nk} \qquad k=0, 1....N-1$ (1)

### Vol. 1, Issue 3, pp.1125-1133

N is the transform length,

 $W_{N}^{mk} = e^{-j2\pi/N}$  is twiddle factor.

2.2 Fast Fourier Transform (FFT)

The Fast Fourier Transform is an efficient algorithm to compute the Discrete Fourier Transform and its inverse. It produces the same result exactly as evaluating the DFT definition directly; the only difference is that FFT is much faster. The most common FFT is the Cooley-Tukey algorithm. This is a divide and conquer algorithm that recursively breaks down a DFT of any composite size  $N=N_{\pm}N_{\pm}$  in to many smaller DFTs of sizes  $N_{\pm}$  and  $N_{\pm}$  along with O (N) multiplications with twiddle factors. The algorithm is divided in to time-based (DIT), frequency-based (DIF) Fast Fourier Transform. DIT FFT orders the data from bit reversal to normal order, where as DIF FFT converse. The basic idea of these algorithms are that the N point FFT is divided into smaller and smaller parts until only two points FFT. Hence this can be called as Radix-2 algorithm.

2.3 Two dimensional (2D) FFT

In communication applications long FFTs are quite often used. These long word lengths require more memory bandwidth for the matrix transpositions. The architecture is implemented using two shorter length FFTs lengths  $N_1$  and  $N_2$  to calculate an FFT of length  $N=N_1 \times N_2$ . The Two dimensional (2D) FFT of  $N=N_1 \times N_2$  is defined as follows:

$$\mathbf{X}[k_1N_2+k_2] = \sum_{n_1=0}^{N_1-1} \left[ e^{-\frac{j(n_12\pi k_2)}{N}} (\sum_{n_2=0}^{N_2-1} \left[ \mathbf{x}[n_2N_1 + n_1] e^{-j(n_2 \cdot 2\pi k_2/N_2)} \right] e^{-j2\pi k_1 n_2/N_1}$$
(2)

$$n_{2} k_{2}$$
 Each set of result 
$$n_{1} k_{1}$$
 FFT results  

$$N_{2} FFT$$
 Memory Multiplier 
$$N_{1} FFT$$
  

$$n_{2} N_{1} + n_{1}$$
 read column wise weights

Where  $0 \le k_1 \le N_1 - 1$ ;  $0 \le k_2 \le N_2 - 1$ ;

Figure 1. Block diagram for 2D FFT

### **III ALGORITHM**

Step 1: The top-level design file is created using a Hardware Description Language (HDL), such as VHDL. i.e. design entry

Step 2: simulate the design using Modelsim software.

Step 3: synthesize to generic gates.

Step 4: map to technology cells, optimize for speed/area by using Xilinx ISE 9.2 software

Step 5: The host computer with Chipscope Pro software is connected Spartan 3e with cable.

### Vol. 1, Issue 3, pp.1125-1133

Step 6: convert the logical design into a physical file format that can be downloaded to the selected target device using impact tool in Xilinx ISE 9.2i software.

Step 7: generate configuration files and download the programming files from a host computer to a Xilinx device.

Step 8: set the triggering options and view the waveforms by using ChipScope pro analyzer

### IV. ARCHITECTURE DESIGN

The architecture variable point FFT processor will be modeled in VHDL language and timing simulation using ModelSim software. This paper used 2D FFT algorithm to design the FFT processor, not only improved the system level of parallelism, but also reduced the demand for memory system.



Figure 2. Block diagram of the overall design of the FFT processor

### 4.1 Overall design of the FFT processor

The 8-point FFT is the core module in the design. This module is the most frequently used in the design. The FFT processor is the downlink module of the OFDMA system. The IEEE Std 802.16 2005 defined is a wireless MAN standard, commonly referred to as WiMAX. The standard clearly defined: the core module of OFDMA physical layer is the FFT module, which can be used in FFT points are 2048, 1024,512 and 128 points. The input is given in time domain by using the DDS core. This is a file that has been generated by Xilinx ISE software. The 64-point FFT can be obtained when the control signal is "00" with the help of pipeline structure of two 8-point FFT shown in Fig 3.

According to the idea of two dimensional (2D) Fourier algorithm, i.e. N=128,  $N_{\rm I}$ =2,  $N_{\rm I}$ =64. from 128=2\*64 we find that to achieve 128 point FFT, first the data is arranged in 64 lines and 2 rows ,next the input data will transform the 64 point FFT ,then the result multiply with twiddle factor and then the result do 2 point FFT[7]. According to Fig 2 the samples that are obtained from the DDS core are given to the 8-point FFT, the data will transform to 8-point FFT, and multiply with twiddle factors that are stored in twiddle factor module. Then in accordance to our requirement we can select the control signal either 00,01,10,11, for 64, 128, 512, 1024- point FFT respectively.

The 64-point FFT can be obtained when the control signal is "00" with the help of pipeline structure of two 8-point FFT shown in Fig 3. The 128-point FFT can be obtained by using the 64-point FFT, 2-point FFT, when the control signal "01".Similarly we can design 512-point FFT with the help of 128-point FFT, 4-point FFT, when control mode is "10".Finally the 1024-point FFT can be obtained with 512-point FFT and 2-point FFT. The whole procedure is based on 2D –FFT algorithm.

The overall design can be modeled using VHDL language, and timing simulation using ModelSim software.

### 4.2 8-point FFT module

### Vol. 1, Issue 3, pp.1125-1133

The 8-point FFT plays a crucial role in processor design .It consists of butterfly module, twiddle factor module and coefficient RAM. The results of this module can be verified by using MATLAB. By using the 8 point FFT pipeline structure 64 point FFT can be obtained. This can detail by using following block diagram.



Fig 3 Block diagram for 64 point FFT using two 8-point FFTs

### 4.3 Twiddle factor module

Twiddle factor module holds the twiddle factors which were calculated and these are multiplied with certain results.

#### Table .1 Twiddle factors

| 1 | Real $(W_8^0) = 1$                  | 0 000 0001 0000 0000 |
|---|-------------------------------------|----------------------|
|   | $\operatorname{Imag}(W_8^0) = 0$    | 0 000 0000 0000 0000 |
| 2 | Real ( $W_8^1$ ) = 0.707106781186   | 0 000 0000 1011 0100 |
|   | Imag ( $W_8^1$ ) = - 0.707106781186 | 1 000 0000 1011 0100 |
| 3 | Real $(W_8^2) = 0$                  | 0 000 0000 0000 0000 |
|   | Imag $(W_8^2) = -1$                 | 1 000 0001 0000 0000 |
| 4 | Real ( $W_8^3$ ) = - 0.707106781186 | 1 000 0000 1011 0100 |
|   | Imag ( $W_8^3$ ) = - 0.707106781186 | 1 000 0000 1011 0100 |
| 5 | Real ( $W_8^4$ ) = -1               | 1 000 0001 0000 0000 |
|   | Imag ( $W_8^4$ ) = 0                | 0 000 0000 0000 0000 |

4.4. Select and control module

This module is crucial to compute the alterable data length in the design. A two bits signal 'mode' is chosen as for the requirement .When mode=00, means to choose 8 point FFT module to complete the 64-point FFT, when mode= 01 means to choose 2-point FFT to complete 128-point FFT. Here the result of 64-point FFT is chosen to do 2-point FFT. Thus we obtain 128-point FFT. Similarly when the mode =10 means to complete 512 –point FFT, when mode=11 means to complete 1024 –point FFT.

### **V. SIMULATION AND IMPLEMENTATION**

### Vol. 1, Issue 3, pp.1125-1133

The input data length of our proposed FFT processor is a parameter which can be decided by itself at the range of 128, 512, 1024. Take 512 –point FFT as an example. At first, the 512 points FFT is coded by MATLAB language. After the chosen FFT algorithm is valid, the architecture of the processor is modeled in VHDL language and verified by Xilinx ISE and timing simulation using ModelSim software. A test bench file included namely tb\_DDS\_1024 mod.

The results of the simulation were seen to match the theoretical calculation of MATLAB result, error within the allowable range. Figure 4 shows the simulation results; Fig 5 gives bus plot of Chip Scope pro Analyzer.

### **VI. CONCLUSION**

In this paper, a variable point FFT processor was designed using FPGA and was applicable to OFDMA system successfully. The results showed that the successful completion of the design altered input points FFT computation, design precision 16-bit, FFT processing result was correct. Therefore this design can be applied to real-time signal processing system, completes the main computing modules in the OFDMA system. The OFDMA based systems are being adopted for use in different flavors of broadband cellular wireless



### systems(BCW).

Figure 4 Waveform of the input cosine



Figure 5 Waveform of the 1024-point FFT



Figure 6 Bus plot of the ChipScope Pro Analyzer

| ChipScope Pro Analyzer [FFT_OFDMA_chipscope_settings]                           |                                                             |     |         |                       |           |                       |                      |                           |               |  |  |
|---------------------------------------------------------------------------------|-------------------------------------------------------------|-----|---------|-----------------------|-----------|-----------------------|----------------------|---------------------------|---------------|--|--|
| Eile View JTAG Chain Device Trigger Setup Waveform Bus Plot <u>W</u> indow Help |                                                             |     |         |                       |           |                       |                      |                           |               |  |  |
| # 🕑 🕨 🔳 T! 🛛 🖓 🗘 🖓                                                              | PP                                                          |     |         |                       |           |                       |                      |                           |               |  |  |
| Project: FFT_OFDMA_chipscope_setti                                              | 🎯 Waveform - DEV:0 MyDevice0 (XC3S500E) UNIT:0 MyILA0 (ILA) |     |         |                       |           |                       |                      |                           |               |  |  |
| P-DEV:0 MyDevice0 (XC3S500E)     P-UNIT:0 MyILA0 (ILA)                          | Bus/Signal                                                  | X   | 0       | 0 1 2 3               | 4 5       | 567<br>               | 8910<br>             | <b>11 12 13</b>           | 14 15         |  |  |
| – Trigger Setup<br>– Waveform                                                   | ⁰~ mag_out                                                  | 177 | 177     | 142 ( 147 ) 159 ( 160 | ( 173 ( 1 | 7 ( 219 ( 177         | ( 175 ( 193 ( 210    | (198 )(219 )(223 )(       | 243 238       |  |  |
| - Listing                                                                       | ∽ fft_length                                                | 2   | 2       |                       |           |                       | 2                    |                           |               |  |  |
| Bus Plot                                                                        | ° x_in                                                      | 36  | 36      | -36 (-100 (-128 (-108 | ) -36 ) 3 | 6 ( 100 ( 124         | ( 108 ) 36 ( -36     | ( -100 )( -128 )( -108 )( | -36 ( 36 (    |  |  |
| Signals: DEV: 0 UNIT: 0                                                         | - DataPort[0]                                               | 1   | 1       |                       |           |                       |                      |                           |               |  |  |
| P Data Port                                                                     | - DataPort[1]                                               | 0   | 0       |                       |           |                       |                      |                           |               |  |  |
| ⊷ mag_out                                                                       | - DataPort[2]                                               | 0   | 0       |                       |           |                       |                      |                           |               |  |  |
| ⊷ x_in<br>— CH: 0 DataPort[0]                                                   | - DataPort[3]                                               | 0   | 0       |                       |           |                       |                      |                           |               |  |  |
| - CH: 1 DataPort[1]                                                             | - DataPort[4]                                               | 1   | 1       |                       |           |                       |                      |                           |               |  |  |
| — CH: 2 DataPort[2]<br>— CH: 3 DataPort[3]                                      | - DataPort[5]                                               | 1   | 1       |                       |           |                       |                      | [                         |               |  |  |
| <ul> <li>CH: 4 DataPort[4]</li> <li>CH: 5 DataPort[5]</li> </ul>                | - DataPort[6]                                               | 0   | 0       |                       |           |                       |                      |                           |               |  |  |
| – CH: 6 DataPort[6]                                                             | - DataPort[7]                                               | 1   |         |                       |           |                       |                      |                           |               |  |  |
| - CH: 7 DataPort[7]                                                             | - DataPort[8]                                               | 0   | 5       |                       |           |                       |                      |                           |               |  |  |
| — CH: 8 DataPort[8]<br>— CH: 9 DataPort[9]                                      | - DataPort[9]                                               | 0   | 0       |                       |           |                       |                      |                           |               |  |  |
| - CH: 10 DataPort[10]                                                           | - DataPort[10]                                              | 0   |         | 3/                    |           |                       |                      |                           |               |  |  |
| — CH: 11 DataPort[11]<br>— CH: 12 DataPort[12]                                  |                                                             |     | 4       |                       |           | 8                     |                      |                           |               |  |  |
| - CH: 13 DataPort[13]                                                           | - DataPort[11]                                              | 0   |         | 2                     |           |                       |                      |                           |               |  |  |
| - CH: 14 DataPort[14]<br>- CH: 15 DataPort[15]                                  | DataPort[12]                                                | 0   | 1.00    | -                     |           |                       |                      |                           |               |  |  |
| - CH: 16 DataPort[16]                                                           | - DataPort[13]                                              | 0   | 0       |                       |           |                       |                      |                           |               |  |  |
| — CH: 17 DataPort[17]<br>— CH: 18 DataPort[18]                                  |                                                             | ()) | ()      |                       |           | -                     |                      |                           | •             |  |  |
| - CH: 19 DataPort[19] - CH: 20 DataPort[20]                                     |                                                             |     |         |                       |           | X: 5                  | ● 0:5                | ▲▲(X-0): 0                |               |  |  |
| CH: 20 DataPort/201                                                             | Per                                                         |     |         |                       |           |                       |                      |                           |               |  |  |
| Upload                                                                          |                                                             |     |         |                       |           |                       |                      |                           | DONE          |  |  |
|                                                                                 | T.                                                          |     | -       |                       | Ter       |                       | Tana                 | fires.                    |               |  |  |
| 🦺 start 🔰 🕄 🙆 🎽 🕅 Mo                                                            | odelSim SE PLUS 6.2c                                        | wa  | ive - d | lefault 🔁 FFT_OFDMA   | 📓 C       | :\Xilinx92i\bin\nt\im | 🤹 MPACT - E:\ecg_ver | ChipScope Pro Analyz      | 🔇 🕵 🗭 6:46 PM |  |  |

Figure 7 Waveform of ChipScope Pro Analyzer

### **VII. REFERENCES**

- [1] E.Lawrey, Multiuser OFDM, in Proc. International Symposium on Signal Processing Applications '99 vol2, 1999, pp.761.764.
- [2] WiMAX Forum, Krishna Ramadas and Raj Jain,"WiMAX System Evaluation Methodology "version 2.1 July 2008
- [3] 3GPP TR 25.814 V.1.5, Physical Layer Aspects for Evolved UTRA(release 7), May 2006.
- [4] IEEE Std 802.16e 2005 and IEEE Std 802.16-2004/corl-2005

[5] Zhang Zhu-jun, "Optimum Design of FFT Processor with Cascade Structure Based on FPGA", J.Nanjing University of Science and Technology.2009.in press.

[6] Xie Yan-lin. "Design and Implementation of a scalable pipeline FFT Processor Based on FPGA" D.Xi Dian University 2007.in press

[7] WANG Xiu-fang, HOU Zhen-long "Design and Implement of FFT Processor for OFDMA system using FPGA" Northeast Petroleum University (ICMEE 2010)